Overview of gene expression at the bin level

Which bins are most highly expressed in each of the two conditions? We will integrate this information with phylogenetic placement.

The following heatmap shows expression levels (in TPM) grouped by bin. Hover over a cell of the heatmap to get more information.
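
As a rough illustration of the grouping behind such a heatmap, the Python sketch below sums per-gene TPM values within each bin for every sample. The gene names, bin names, sample names and values are all made up, and summing (rather than averaging) is an assumption: the report does not state which aggregation was used.

```python
from collections import defaultdict

# Hypothetical per-gene TPM values: gene -> {sample: TPM}
tpm = {
    "gene_a": {"cond1_rep1": 10.0, "cond2_rep1": 0.5},
    "gene_b": {"cond1_rep1": 5.0, "cond2_rep1": 1.5},
    "gene_c": {"cond1_rep1": 0.0, "cond2_rep1": 20.0},
}

# Hypothetical assignment of genes to bins
gene_to_bin = {"gene_a": "bin_1", "gene_b": "bin_1", "gene_c": "bin_2"}


def tpm_by_bin(tpm, gene_to_bin):
    """Sum TPM over all genes of a bin, separately for each sample."""
    totals = defaultdict(lambda: defaultdict(float))
    for gene, per_sample in tpm.items():
        bin_id = gene_to_bin[gene]
        for sample, value in per_sample.items():
            totals[bin_id][sample] += value
    # Convert nested defaultdicts to plain dicts for easier inspection
    return {bin_id: dict(samples) for bin_id, samples in totals.items()}


bin_tpm = tpm_by_bin(tpm, gene_to_bin)
for bin_id, samples in bin_tpm.items():
    print(bin_id, samples)
```

Each entry of `bin_tpm` corresponds to one row of a bin-by-sample heatmap.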

Differential gene expression

Aims:

  • Perform DGE
  • Group by bin
  • Group by taxonomy
  • Group by functional annotation

Methods

Run DESeq2 on the Salmon quantification output.

Exploratory analysis and visualization

Prefilter the dataset (for visualization purposes only). How many genes does the raw count data contain?

## [1] 116479

We keep only rows (= genes) with a total of more than 1 count summed across all samples. How many genes remain?

## [1] 77740
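
The filtering rule can be sketched in a few lines of Python. This is not the DESeq2 code that produced the numbers above, only an illustration of the rule on made-up counts. It reads "more than 1 count across all samples" as a total summed over samples, which is an assumption based on the workflow cited in this report.

```python
# Hypothetical raw counts: gene -> counts per sample
counts = {
    "gene_a": [0, 0, 0, 0],    # never observed
    "gene_b": [1, 0, 0, 0],    # a single count in total
    "gene_c": [1, 1, 0, 0],    # two counts in total
    "gene_d": [25, 40, 3, 7],  # well covered
}


def prefilter(counts, min_total=1):
    """Keep genes whose counts summed over all samples exceed min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) > min_total}


kept = prefilter(counts)
print(len(counts), "genes before filtering,", len(kept), "after")
```

Genes with zero or one count in total carry almost no information for the plots that follow, so dropping them mainly reduces the size of the dataset.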

Pick a method to visualize sample relationships

Source: https://www.bioconductor.org/packages/devel/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#exploratory-analysis-and-visualization

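We need to choose a transformation method before running exploratory analyses. The workflow linked above describes two: the variance stabilizing transformation (VST) and the regularized logarithm (rlog).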

Scatterplot of transformed counts from two samples.

Which samples are similar to each other, and which are different? Does this fit the expectation from the experiment’s design? We use the VST-transformed results to:

  • draw a heatmap of sample-sample distances
  • run a principal component analysis (PCA).

Heatmap of sample-to-sample distances using the variance stabilizing transformed values.
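
To make the notion of sample-to-sample distance concrete, the Python sketch below computes pairwise Euclidean distances between samples from transformed expression values. The sample names and values are invented, and the choice of Euclidean distance is an assumption: the report does not name the metric.

```python
import math

# Hypothetical transformed expression values: sample -> one value per gene
transformed = {
    "cond1_rep1": [5.0, 8.0, 2.0],
    "cond1_rep2": [5.0, 8.0, 3.0],
    "cond2_rep1": [9.0, 5.0, 2.0],
}


def sample_distances(transformed):
    """Return the Euclidean distance for every ordered pair of samples."""
    names = list(transformed)
    return {
        (a, b): math.dist(transformed[a], transformed[b])
        for a in names
        for b in names
    }


dist = sample_distances(transformed)
for (a, b), d in dist.items():
    print(f"{a}\t{b}\t{d:.2f}")
```

In this toy example the two replicates of the first condition are closer to each other than either is to the sample from the second condition, which is the kind of pattern the heatmap is meant to reveal.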

Principal component analysis (PCA)

Heatmap showing how much each gene deviates in a specific sample from the gene’s average across all samples. Top 20 genes are shown.
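
The values behind a heatmap of this kind can be sketched as follows in Python: select the most variable genes, then subtract each gene's mean across samples. Ranking the genes by variance is an assumption (the report only says "top 20"), and the gene names and values are made up.

```python
from statistics import mean, variance

# Hypothetical transformed expression values: gene -> one value per sample
values = {
    "gene_a": [5.0, 5.0, 5.0, 5.0],  # constant across samples
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # strongly variable
    "gene_c": [1.0, 1.0, 3.0, 3.0],  # moderately variable
}


def top_variable_centered(values, n_top=20):
    """Pick the n_top highest-variance genes and center each on its own mean."""
    ranked = sorted(values, key=lambda g: variance(values[g]), reverse=True)
    centered = {}
    for gene in ranked[:n_top]:
        avg = mean(values[gene])
        # Deviation of each sample from the gene's average across samples
        centered[gene] = [v - avg for v in values[gene]]
    return centered


# n_top=2 only because the toy dataset has three genes
centered = top_variable_centered(values, n_top=2)
for gene, row in centered.items():
    print(gene, row)
```

Centering each gene on its own mean removes differences in overall expression level between genes, so the colors reflect only how a gene varies from sample to sample.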